The Gene Ontology Handbook by Christophe Dessimoz & Nives Škunca

The Gene Ontology Handbook by Christophe Dessimoz & Nives Škunca

Author:Christophe Dessimoz & Nives Škunca
Language: eng
Format: epub, pdf
Publisher: Springer New York, New York, NY


5 Discussion

Critical assessment challenges have been successfully adopted in a number of fields due to several factors. First, the recognition that improvements to methods are indeed necessary. Second, the ability of the community to mobilize enough of its members to engage in a challenge. Mobilizing a community is not a trivial task, as groups have their own research priorities and only a limited amount of resources to achieve them, which may deter them from undertaking a time-consuming and competitive effort a challenge may pose. At the same time, there are quite a few incentives to join a community challenge. Testing one’s method objectively by a third party can establish credibility, help point out flaws, and suggest improvements. Engaging with other groups may lead to collaborations and other opportunities. Finally, the promise of doing well in a challenge can be a strong incentive heralding a group’s excellence in their field. Since the assessment metrics are crucial to the performance of the teams, large efforts are made to create multiple metrics and to describe exactly what they measure. Good challenge organizers try to be attentive to the requests of the participants, and to have the rules of the challenge evolve based on the needs of the community. An understanding that a challenge’s ultimate goal is to improve methodologies and that it takes several rounds of repeating the challenge to see results.

The first two CAFA challenges helped clarify that protein function prediction is a vibrant field, but also one of the most challenging tasks in computational biology. For example, CAFA provided evidence that the available function prediction algorithms outperform a straightforward use of sequence alignments in function transfer. The performance of methods in the Molecular Function category has consistently been reliable and also showed progress over time (unpublished results from CAFA2). On the other hand, the performance in the Biological Process or Cellular Component ontologies has not yet met expectations. One of the reasons for this may be that the terms in these ontologies are less predictable using amino acid sequence data and instead would rely more on high-quality systems data; e.g., see [6]. The challenge has also helped clarify the problems of evaluation, both in terms of evaluating over consistent sub-graphs in the ontology but also in the presence of incomplete and biased molecular data. Finally, although it is still early, some best practices in the field are beginning to emerge. Exploiting multiple types of data is typically advantageous, although we have observed that both machine learning expertise and good biological insights tend to result in strong performance. Overall, while the methods in the Molecular Function ontology seem to be maturing, in part because of the strong signal in sequence data, the methods in the Biological Process and Cellular Component ontologies still appear to be in the early stages of development. With the help of better data over time, we expect significant improvements in these categories in the future CAFA experiments.

Overall, CAFA generated a strong positive response to the call for both



Download



Copyright Disclaimer:
This site does not store any files on its server. We only index and link to content provided by other sites. Please contact the content providers to delete copyright contents if any and email us, we'll remove relevant links or contents immediately.